A Better Confidence Interval for Proportions

The Problem with Small Samples

The CLT guarantees that the sample proportion \(\hat{p}\) is asymptotically normal, but the confidence interval is not guaranteed to be between \([0, 1]\), and is generally not efficient. This post demonstrates how the Delta Method can yield a much better confidence interval, and the calculation required to do so is very simple.

Under the Central Limit Theorem, \(\hat{p} \sim \text{N}(0, \frac{p(1-p)}{n})\) as \(n\rightarrow\infty\). We can use this to give a confidence interval for p: \[\widehat{\text{CI}(p)}=\hat{p} \pm q_\alpha\sqrt{\frac{p(1-p)}{n}}\] where \(\alpha\) is the false positive rate and \(q_\alpha:=\int_{x=1-\alpha/2}^\infty \mathbb{P}(N(0, 1) = x)\). But for small \(n\), this won't be exact because \(p\in [0, 1]\), but \(\widehat{\text{CI}(p)}\) could be outside \([0, 1]\). Generally, for small \(n\), we don't have any guarantee that \(100\cdot(1-\alpha)\%\) of the confidence intervals will contain \(p\).

New Method

For \(x_1, ..., x_n \sim X\) with mean \(\mu\) and variance \(\sigma^2\), the CLT says: \[\sum x_i/n \sim N(\mu, \sigma^2/n)\] as \(n\rightarrow \infty\). Let's say we want to transform the sample average by appling a function \(f\) to it. The CLT still applies! Specifically:

\[f(\sum (x_i)/n) \sim N(f(\mu,\sigma^2/n\cdot \left[f'(\mu)\right]^2)\]

This is called the Delta Method. So here's an idea for how to make a better confidence interval for the proportion: I'll tranform \(\hat{p}\) to range between \((-\infty, \infty)\), find a confidence interval for that, then undo the transformation to the resulting lower and upper bounds so that they are also inside \([0, 1]\). Any function \(f: [0, 1] \rightarrow (-\infty, \infty)\) will work, but I'll pick one similar to what we saw in class: \(f(p)=\text{log}(\frac{p}{1-p})\).

If you do the math, you'll see that \(f'(p) = \frac{1}{p(1-p)}\), and \(f^{-1}(z) = \frac{e^z}{1+e^z}\). So the Delta Method says the CLT on the transformed proportion is:

\[f(\hat{p}) \sim N\left(f(p),\ \frac{1}{p(1-p)n}\right)\]

and the transformed confidence interval is:

\[(\text{lower}, \text{upper}) = f(\hat{p}) \pm q_\alpha\sqrt{\frac{1}{\hat{p}(1-\hat{p})n}}\]

To put get back a new confidence interval for \(\hat{p}\), I'll apply \(f^{-1}\) to the confidence interval bounds. That is, my new confidence interval is:\[\left(\frac{e^\text{lower}}{1+e^\text{lower}},\ \frac{e^\text{upper}}{1+e^\text{upper}}\right)\]

Performance

For a range of true proportions \(p\) and sample sizes \(n\), we would like a confidence interval to actually contain the true proportion \(100\cdot(1-\alpha)\%\) of the time. No more, no less. The following plots show that the traditional use of the CLT is reliably off– specifically, it under-covers the true proportion in small sample sizes. The new method is pretty much spot on.

Finally, I'm adding the code I wrote to create these simulations and plots.

Plots

Python Code


import numpy as np
import scipy.stats as ss
import copy

def test(n, B, p, t=.01, alpha=.05):
    q = ss.norm(0, 1).ppf(1 - alpha/2)

    ### traditional method
    ph = ss.binom(n, p).rvs(B) / n
    old_coverage = np.logical_and(
        ph - q * np.sqrt(ph*(1-ph)/n) < p,
        ph + q * np.sqrt(ph*(1-ph)/n) > p
    )

    ### new method
    # nuisance: truncate so ph strictly between 0 and 1
    pht = copy.deepcopy(ph)
    pht[pht == 0] = t
    pht[pht == 1] = 1-t
    
    lower = np.log(pht/(1-pht)) - q  / np.sqrt((n * pht * (1-pht)))
    upper = np.log(pht/(1-pht)) + q  / np.sqrt((n * pht * (1-pht)))

    new_coverage = np.logical_and(
        np.exp(lower)/(1+np.exp(lower)) < p,
        np.exp(upper)/(1+np.exp(upper)) > p
    )
    return np.mean(old_coverage), np.mean(new_coverage)

ns = (1, 2, 3, 4, 5,
      10, 15, 20, 25, 30,
      40, 50, 60, 70, 80, 90, 100,
      200, 300, 400, 500, 1000)

ps = (.01, .05, .1, .2, .3)
tests = [np.array([test(n, B=1000000, p=p, t=.0001) for n in ns]) for p in ps]

for example_p_ix in range(5):
    plt.semilogx((0, max(ns)), (.95,)*2,
                 linewidth=2, linestyle="--", c="#000000")
    plt.plot(ns, tests[example_p_ix][:,0], "o-",
             label="Traditional Use of CLT")
    plt.plot(ns, tests[example_p_ix][:,1], "o-",
             label=r"CLT+Delta Method on log($\frac{p}{1-p}$)")
    plt.xlabel("Sample Size")
    plt.ylabel("Coverage")
    plt.title(f"Coverage for two estimates of the confidence interval\nTrue p
              = {ps[example_p_ix]}")
    plt.legend()
    plt.savefig(f"~/coverages/plot_{example_p_ix}.pdf",
                format="pdf", bbox_inches="tight")
    plt.close()

← Back to Technical Writings